attraction region
Training Dynamics of Nonlinear Contrastive Learning Model in the High Dimensional Limit
This letter presents a high-dimensional analysis of the training dynamics of a single-layer nonlinear contrastive learning model. The empirical distribution of the model weights converges to a deterministic measure governed by a McKean-Vlasov nonlinear partial differential equation (PDE). Under L2 regularization, this PDE reduces to a closed set of low-dimensional ordinary differential equations (ODEs) that capture the evolution of the model's performance during training. We analyze the locations and stability of the ODEs' fixed points, unveiling several interesting findings. First, only the second moment of the hidden variable affects feature learnability at the uninformative-initialization state. Second, higher moments influence the probability of feature selection by controlling the attraction region, rather than by affecting local stability. Finally, independent noise added during data augmentation degrades performance, but negatively correlated noise can reduce the variance of the gradient estimate, yielding better performance. Despite the simplicity of the analyzed model, it exhibits rich training dynamics, paving the way toward understanding the more complex mechanisms behind practical large models.

Contrastive learning (CL) is a promising self-supervised strategy for learning semantic representations from unlabeled data. The model is trained by encouraging representations of different views of the same image to be similar while pushing representations of distinct images apart. The pre-trained model then serves as a backbone for downstream tasks, e.g., classification, segmentation, and object recognition.
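The letter's exact model is not reproduced in the abstract, but the noise-correlation effect can be illustrated with a toy simulation. The sketch below is a minimal stand-in, assuming a single spiked signal direction u, a bilinear positive-pair similarity loss with an L2 penalty, and Gaussian augmentation noise whose cross-view correlation is set by rho; these choices are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 500        # ambient dimension
steps = 2000
lr = 0.005     # learning rate
lam = 1.0      # L2 regularization strength
tau = 0.5      # augmentation-noise level
rho = -0.5     # correlation of the two views' noises (negative is helpful)

u = np.ones(d) / np.sqrt(d)           # assumed hidden signal direction (spike)
w = rng.normal(size=d) / np.sqrt(d)   # uninformative initialization

for t in range(steps):
    z = rng.normal()                  # hidden scalar variable, E[z^2] = 1
    xi1 = rng.normal(size=d)
    # second view's noise is correlated with the first through rho
    xi2 = rho * xi1 + np.sqrt(1 - rho**2) * rng.normal(size=d)
    x1 = z * u + tau * xi1            # two augmented views of the same sample
    x2 = z * u + tau * xi2

    # toy contrastive objective for one positive pair:
    # minimize -(w.x1)(w.x2) + (lam/2)||w||^2
    grad = -((w @ x2) * x1 + (w @ x1) * x2) + lam * w
    w -= lr * grad

print("overlap with signal:", abs(w @ u) / np.linalg.norm(w))
```

In this toy, negative rho both damps directions orthogonal to u (the expected decay rate there is 2*tau^2*rho - lam) and reduces the variance of the stochastic gradient, consistent with the abstract's claim that anti-correlated augmentation noise can help.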
A Deterministic Approach to Avoid Saddle Points
Kreusser, Lisa Maria, Osher, Stanley J., Wang, Bao
Loss functions with a large number of saddle points are one of the main obstacles to training many modern machine learning models. Gradient descent (GD) is a fundamental machine learning algorithm, yet it converges to a saddle point for certain initial data. We call the region formed by these initial values the "attraction region." For quadratic functions, GD converges to a saddle point if the initial data lies in a subspace of up to n-1 dimensions. In this paper, we prove that a small modification of the recently proposed Laplacian smoothing gradient descent (LSGD) [Osher et al., arXiv:1806.06317] helps avoid saddle points without sacrificing the convergence rate of GD. In particular, we show that the dimension of LSGD's attraction region is at most floor((n-1)/2) for a class of quadratic functions, which is significantly smaller than GD's (n-1)-dimensional attraction region.
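For intuition, the following minimal sketch contrasts GD and LSGD on a toy quadratic saddle. It assumes the periodic one-dimensional Laplacian smoothing of the LSGD paper, where the gradient g is replaced by (I - sigma*L)^(-1) g and the circulant system is solved with an FFT; the floor((n-1)/2) bound is not verified here, only the escape from an initial point inside GD's attraction region.

```python
import numpy as np

def smooth(g, sigma=1.0):
    """Apply (I - sigma * L)^(-1) to g via FFT, where L is the
    1-D periodic discrete Laplacian (circulant, so DFT-diagonal)."""
    n = g.shape[0]
    kernel = np.zeros(n)
    kernel[0], kernel[1], kernel[-1] = 1 + 2 * sigma, -sigma, -sigma
    return np.real(np.fft.ifft(np.fft.fft(g) / np.fft.fft(kernel)))

# quadratic with a saddle at 0: f(x) = 0.5 * sum_i d_i * x_i^2, d = (1,...,1,-1)
n = 8
d = np.ones(n)
d[-1] = -1.0
grad = lambda x: d * x

rng = np.random.default_rng(1)
x0 = rng.normal(size=n)
x0[-1] = 0.0            # start on GD's stable manifold (no unstable component)

eta = 0.1
x_gd, x_ls = x0.copy(), x0.copy()
for _ in range(300):
    x_gd -= eta * grad(x_gd)           # plain GD: stays on the manifold, hits the saddle
    x_ls -= eta * smooth(grad(x_ls))   # LSGD: smoothing leaks into the unstable direction

print("GD   distance from saddle:", np.linalg.norm(x_gd))   # ~ 0
print("LSGD distance from saddle:", np.linalg.norm(x_ls))   # grows (f is unbounded below)
```

Because the smoothing operator mixes coordinates, the smoothed gradient picks up a component along the unstable direction even when the raw gradient has none, which is the mechanism behind the shrunken attraction region.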
Quickshift++: Provably Good Initializations for Sample-Based Mean Shift
Jiang, Heinrich, Jang, Jennifer, Kpotufe, Samory
We provide initial seedings for the Quick Shift clustering algorithm that approximate the locally high-density regions of the data. Such seedings act as more stable and expressive cluster-cores than the singleton modes found by Quick Shift. We establish statistical consistency guarantees for this modification, and demonstrate strong clustering performance on real datasets as well as promising applications to image segmentation.
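To make the setup concrete, here is a small sketch of the underlying mode-seeking step: sample-based Quick Shift with a k-NN density estimate, where each point climbs to its nearest strictly-denser neighbor within a radius tau. Quickshift++ itself replaces the singleton modes with density-based cluster-cores before this assignment step; that refinement is omitted, and the function name quickshift_knn and its parameters are illustrative, not from the paper.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def quickshift_knn(X, k=10, tau=1.0):
    """Sample-based Quick Shift: link each point to its nearest
    strictly-denser neighbor within radius tau; unlinked points are modes."""
    n = X.shape[0]
    # k-NN density estimate: larger when the k-th neighbor is closer
    # (monotone-equivalent to the usual k / (n * r_k^dim) estimate)
    dists, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    density = 1.0 / (dists[:, -1] + 1e-12)

    parent = np.arange(n)
    for i in range(n):
        higher = np.where(density > density[i])[0]
        if higher.size:
            d = np.linalg.norm(X[higher] - X[i], axis=1)
            if d.min() <= tau:               # climb only within radius tau
                parent[i] = higher[np.argmin(d)]

    labels = parent.copy()                   # follow uphill links to mode roots
    while not np.array_equal(labels, parent[labels]):
        labels = parent[labels]
    return labels

# toy usage: two well-separated Gaussian blobs, expected to yield two clusters
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(3, 0.3, (100, 2))])
print(np.unique(quickshift_knn(X)).size)
```

The uphill links form a forest because density strictly increases along each link, so following parents always terminates at a mode root.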